Antipattern: A Shortcut That Gets You Lost
Let’s understand how using implicit columns creates issues in a database.
We'll cover the following
Although using wildcards and unnamed columns satisfies the goal of less typing, this habit creates several hazards.
Breaking refactoring#
Let’s suppose we need to add a column to the Bugs
table, such as date_due
, for scheduling purposes.
But our INSERT
statement results in an error because we listed eleven values instead of the twelve the table now expects.
In an INSERT
statement that uses implicit columns, we must give values for all the columns in the same order that the columns are defined in the table. If the columns change, the statement produces an error or assigns values to the wrong columns.
Suppose we run a SELECT *
query, and since we don’t know the column names, we reference the columns based on their ordinal position:
<?php
$stmt = $pdo->query("SELECT * FROM Bugs WHERE bug_id = 1234");
$row = $stmt->fetch();
$hours = $row[10];
?>
But unknown to us, another person on the team dropped a column:
Since, the hours
column is no longer at position 10, our application is using the value in another column by mistake. As columns are renamed, added, or dropped, our query result could change in ways our code doesn’t support. We can’t predict how many columns our query returns if we use a wildcard.
These errors can propagate through our code, and by the time we notice the problem in the output of the application, it’s hard to trace back to the line where the mistake occurred.
Hidden costs#
The convenience of using wildcards in queries can harm performance and scalability. The more columns our query fetches, the more data must travel over the network between our application and the database server.
We often have many queries running concurrently in our production application environment. They compete for access to the same network bandwidth. Even a gigabit network can be saturated by a hundred application clients querying for thousands of rows at a time.
Object-relational mapping (ORM) techniques such as Active Record often use SELECT *
by default to populate the fields of an object representing a row in a database. Even if the ORM offers the means to override this behavior, most programmers don’t bother.
Another limitation of wildcards#
Programmers sometimes ask, “Is there a shortcut to request all columns, except a couple that I specify?” Perhaps these programmers are trying to avoid the resource cost of fetching bulky TEXT
columns that they don’t need, but they still want to utilize the convenience of using a wildcard.
However, the answer to this question is no. SQL does not support any syntax that means, “all the columns I want but none that I don’t want.” We have to either use the wildcard to request all columns from a table or we have to explicitly list the columns that we want.